Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 5570 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 435.3 KiB |
| Average record size in memory | 80.0 B |
Variable types
| Categorical | 1 |
|---|---|
| Numeric | 9 |
MUNICIPIO has a high cardinality: 5297 distinct values | High cardinality |
POP_EST is highly correlated with NRO_EMP and 1 other fields | High correlation |
IDHM is highly correlated with PIBCAP | High correlation |
PIBCAP is highly correlated with IDHM | High correlation |
NRO_EMP is highly correlated with POP_EST | High correlation |
MASSA_PCAP is highly correlated with MASSA_PCAP_POP | High correlation |
MASSA_PCAP_POP is highly correlated with MASSA_PCAP | High correlation |
DESP_TOT_RSU is highly correlated with POP_EST | High correlation |
NRO_EMP is highly correlated with POP_EST and 1 other fields | High correlation |
DESP_TOT_RSU is highly correlated with POP_EST and 1 other fields | High correlation |
POP_EST is highly correlated with DENS_DEM and 2 other fields | High correlation |
DENS_DEM is highly correlated with POP_EST and 2 other fields | High correlation |
NRO_EMP is highly correlated with POP_EST and 2 other fields | High correlation |
DESP_TOT_RSU is highly correlated with POP_EST and 2 other fields | High correlation |
POP_EST is highly skewed (γ1 = 37.10739116) | Skewed |
NRO_EMP is highly skewed (γ1 = 34.98315526) | Skewed |
DESP_TOT_RSU is highly skewed (γ1 = 41.42497498) | Skewed |
MUNICIPIO is uniformly distributed | Uniform |
NRO_EMP has 3558 (63.9%) zeros | Zeros |
Reproduction
| Analysis started | 2022-09-10 01:07:23.844761 |
|---|---|
| Analysis finished | 2022-09-10 01:08:09.368260 |
| Duration | 45.52 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
MUNICIPIO
Categorical
| Distinct | 5297 |
|---|---|
| Distinct (%) | 95.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.6 KiB |
| Bom Jesus | 5 |
|---|---|
| São Domingos | 5 |
| São Francisco | 4 |
| Planalto | 4 |
| Santa Helena | 4 |
| Other values (5292) | 5548 |
Length
| Max length | 32 |
|---|---|
| Median length | 27 |
| Mean length | 11.60771993 |
| Min length | 3 |
Characters and Unicode
| Total characters | 64655 |
|---|---|
| Distinct characters | 71 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
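The category breakdown reported for MUNICIPIO can be approximated with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own implementation: the function name `category_counts` is illustrative, and the input is only the five sample values listed in this report, so the counts will not match the full-column figures. The codes `Ll`, `Lu`, `Zs`, `Po` and `Pd` correspond to Lowercase Letter, Uppercase Letter, Space Separator, Other Punctuation and Dash Punctuation.

```python
import unicodedata
from collections import Counter

# The five MUNICIPIO values shown in the Sample table of this report.
names = ["Abadia de Goiás", "Abadia dos Dourados", "Abadiânia", "Abaeté", "Abaetetuba"]


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Ll', 'Lu', 'Zs')."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so an accented letter is one code point, not letter + mark.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(names)
total = sum(counts.values())
for cat, n in counts.most_common():
    print(f"{cat}: {n} ({n / total:.1%})")
```

Scripts and blocks are not exposed by the standard library, so reproducing those two tables would need an additional package.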
Unique
| Unique | 5065 |
|---|---|
| Unique (%) | 90.9% |
Sample
| 1st row | Abadia de Goiás |
|---|---|
| 2nd row | Abadia dos Dourados |
| 3rd row | Abadiânia |
| 4th row | Abaeté |
| 5th row | Abaetetuba |
Common Values
| Value | Count | Frequency (%) |
| Bom Jesus | 5 | 0.1% |
| São Domingos | 5 | 0.1% |
| São Francisco | 4 | 0.1% |
| Planalto | 4 | 0.1% |
| Santa Helena | 4 | 0.1% |
| Bonito | 4 | 0.1% |
| Santa Terezinha | 4 | 0.1% |
| Vera Cruz | 4 | 0.1% |
| Santa Inês | 4 | 0.1% |
| Santa Luzia | 4 | 0.1% |
| Other values (5287) | 5528 | 99.2% |
Length
Histogram of lengths of the category
Words
| Value | Count | Frequency (%) |
|---|---|---|
| do | 757 | 7.4% |
| são | 364 | 3.5% |
| de | 300 | 2.9% |
| santa | 161 | 1.6% |
| da | 143 | 1.4% |
| nova | 135 | 1.3% |
| sul | 115 | 1.1% |
| rio | 94 | 0.9% |
| dos | 73 | 0.7% |
| josé | 70 | 0.7% |
| Other values (3959) | 8071 | 78.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 8789 | 13.6% |
| o | 5959 | 9.2% |
| (space) | 4713 | 7.3% |
| r | 4531 | 7.0% |
| i | 4388 | 6.8% |
| e | 3758 | 5.8% |
| n | 3197 | 4.9% |
| d | 2553 | 3.9% |
| s | 2419 | 3.7% |
| t | 2291 | 3.5% |
| Other values (61) | 22057 | 34.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 50859 | 78.7% |
| Uppercase Letter | 9009 | 13.9% |
| Space Separator | 4713 | 7.3% |
| Other Punctuation | 47 | 0.1% |
| Dash Punctuation | 27 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| a | 8789 | 17.3% |
| o | 5959 | 11.7% |
| r | 4531 | 8.9% |
| i | 4388 | 8.6% |
| e | 3758 | 7.4% |
| n | 3197 | 6.3% |
| d | 2553 | 5.0% |
| s | 2419 | 4.8% |
| t | 2291 | 4.5% |
| u | 2154 | 4.2% |
| Other values (28) | 10820 | 21.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| S | 1136 | 12.6% |
| C | 971 | 10.8% |
| P | 911 | 10.1% |
| M | 721 | 8.0% |
| A | 697 | 7.7% |
| B | 602 | 6.7% |
| I | 475 | 5.3% |
| J | 405 | 4.5% |
| G | 392 | 4.4% |
| R | 367 | 4.1% |
| Other values (20) | 2332 | 25.9% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 4713 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| ' | 47 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 27 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 59868 | 92.6% |
| Common | 4787 | 7.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| a | 8789 | 14.7% |
| o | 5959 | 10.0% |
| r | 4531 | 7.6% |
| i | 4388 | 7.3% |
| e | 3758 | 6.3% |
| n | 3197 | 5.3% |
| d | 2553 | 4.3% |
| s | 2419 | 4.0% |
| t | 2291 | 3.8% |
| u | 2154 | 3.6% |
| Other values (58) | 19829 | 33.1% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 4713 | 98.5% |
| ' | 47 | 1.0% |
| - | 27 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 61812 | 95.6% |
| None | 2843 | 4.4% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| a | 8789 | 14.2% |
| o | 5959 | 9.6% |
| (space) | 4713 | 7.6% |
| r | 4531 | 7.3% |
| i | 4388 | 7.1% |
| e | 3758 | 6.1% |
| n | 3197 | 5.2% |
| d | 2553 | 4.1% |
| s | 2419 | 3.9% |
| t | 2291 | 3.7% |
| Other values (44) | 19214 | 31.1% |
None
| Value | Count | Frequency (%) |
|---|---|---|
| ã | 794 | 27.9% |
| á | 395 | 13.9% |
| í | 335 | 11.8% |
| é | 319 | 11.2% |
| ç | 268 | 9.4% |
| ó | 243 | 8.5% |
| â | 161 | 5.7% |
| ú | 100 | 3.5% |
| ô | 71 | 2.5% |
| ê | 71 | 2.5% |
| Other values (7) | 86 | 3.0% |
POP_EST
Real number (ℝ≥0)
| Distinct | 5110 |
|---|---|
| Distinct (%) | 91.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38297.60126 |
| Minimum | 771 |
|---|---|
| Maximum | 12396372 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 771 |
|---|---|
| 5-th percentile | 2476.45 |
| Q1 | 5454 |
| median | 11732 |
| Q3 | 25764.75 |
| 95-th percentile | 116227.95 |
| Maximum | 12396372 |
| Range | 12395601 |
| Interquartile range (IQR) | 20310.75 |
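The fractional quantiles in the table (for example Q3 = 25764.75) are consistent with linear interpolation between neighbouring order statistics, which is the default quantile method in pandas. The sketch below assumes that method; the function name `quantile` is illustrative, and the input is only the ten POP_EST values from the "First rows" sample, so the results differ from the full-column figures.

```python
def quantile(values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    pos = q * (len(data) - 1)          # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


# POP_EST for the ten municipalities listed under "First rows".
pop = [9158, 7022, 20873, 23263, 160439, 11965, 8681, 20594, 7360, 2534]

q1, median, q3 = (quantile(pop, q) for q in (0.25, 0.5, 0.75))
print("Q1:", q1)
print("median:", median)
print("Q3:", q3)
print("IQR:", q3 - q1)
```

The interquartile range is simply Q3 minus Q1, which for the full column gives 25764.75 - 5454 = 20310.75, matching the table.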
Descriptive statistics
| Standard deviation | 224288.1528 |
|---|---|
| Coefficient of variation (CV) | 5.856454333 |
| Kurtosis | 1822.798199 |
| Mean | 38297.60126 |
| Median Absolute Deviation (MAD) | 7558.5 |
| Skewness | 37.10739116 |
| Sum | 213317639 |
| Variance | 5.030517549 × 10¹⁰ |
| Monotonicity | Not monotonic |
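Several of the descriptive statistics are simple functions of the others. The coefficient of variation is the standard deviation divided by the mean (224288.1528 / 38297.60126 ≈ 5.856), and the MAD is the median of the absolute deviations from the median. The sketch below uses only the standard library and makes two assumptions: the standard deviation is the sample version (n - 1 in the denominator), and the skewness is the bias-adjusted Fisher-Pearson estimator, which is what pandas reports. The function name `describe` and the ten-row input are illustrative.

```python
import math
import statistics


def describe(values):
    """Return mean, sample std, CV, MAD and adjusted skewness for n >= 3 values."""
    values = list(values)
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)                      # sample std (n - 1)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    # Biased central moments, then the small-sample correction for skewness.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
    return {"mean": mean, "std": std, "cv": std / mean, "mad": mad, "skewness": skewness}


# POP_EST for the ten municipalities listed under "First rows".
pop = [9158, 7022, 20873, 23263, 160439, 11965, 8681, 20594, 7360, 2534]
for name, value in describe(pop).items():
    print(f"{name}: {value:.4f}")
```

Even on this small sample the single large municipality pulls the mean well above the median and produces a strongly positive skewness, the same pattern the full column shows.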
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 6232 | 4 | 0.1% |
| 2939 | 3 | 0.1% |
| 3861 | 3 | 0.1% |
| 4911 | 3 | 0.1% |
| 5447 | 3 | 0.1% |
| 5646 | 3 | 0.1% |
| 6115 | 3 | 0.1% |
| 16158 | 3 | 0.1% |
| 14415 | 3 | 0.1% |
| 3824 | 3 | 0.1% |
| Other values (5100) | 5539 | 99.4% |
| Value | Count | Frequency (%) |
|---|---|---|
| 771 | 1 | < 0.1% |
| 839 | 1 | < 0.1% |
| 909 | 1 | < 0.1% |
| 932 | 1 | < 0.1% |
| 1084 | 1 | < 0.1% |
| 1124 | 1 | < 0.1% |
| 1142 | 1 | < 0.1% |
| 1150 | 1 | < 0.1% |
| 1171 | 1 | < 0.1% |
| 1211 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 12396372 | 1 | < 0.1% |
| 6775561 | 1 | < 0.1% |
| 3094325 | 1 | < 0.1% |
| 2900319 | 1 | < 0.1% |
| 2703391 | 1 | < 0.1% |
| 2530701 | 1 | < 0.1% |
| 2255903 | 1 | < 0.1% |
| 1963726 | 1 | < 0.1% |
| 1661017 | 1 | < 0.1% |
| 1555626 | 1 | < 0.1% |
DENS_DEM
Real number (ℝ≥0)
| Distinct | 4033 |
|---|---|
| Distinct (%) | 72.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 108.2024892 |
| Minimum | 0.13 |
|---|---|
| Maximum | 13024.56 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0.13 |
|---|---|
| 5-th percentile | 2.279 |
| Q1 | 11.57 |
| median | 24.4 |
| Q3 | 51.835 |
| 95-th percentile | 249.5835 |
| Maximum | 13024.56 |
| Range | 13024.43 |
| Interquartile range (IQR) | 40.265 |
Descriptive statistics
| Standard deviation | 571.8601176 |
|---|---|
| Coefficient of variation (CV) | 5.285092068 |
| Kurtosis | 226.818646 |
| Mean | 108.2024892 |
| Median Absolute Deviation (MAD) | 15.995 |
| Skewness | 13.59588765 |
| Sum | 602687.865 |
| Variance | 327023.9941 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1.31 | 6 | 0.1% |
| 19.32 | 6 | 0.1% |
| 9.95 | 6 | 0.1% |
| 2.79 | 5 | 0.1% |
| 12.57 | 5 | 0.1% |
| 11.06 | 5 | 0.1% |
| 12.79 | 5 | 0.1% |
| 4.67 | 5 | 0.1% |
| 9.17 | 5 | 0.1% |
| 4.03 | 5 | 0.1% |
| Other values (4023) | 5517 | 99.0% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.13 | 1 | < 0.1% |
| 0.2 | 1 | < 0.1% |
| 0.21 | 2 | < 0.1% |
| 0.23 | 1 | < 0.1% |
| 0.26 | 2 | < 0.1% |
| 0.28 | 1 | < 0.1% |
| 0.29 | 1 | < 0.1% |
| 0.32 | 1 | < 0.1% |
| 0.33 | 3 | 0.1% |
| 0.34 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 13024.56 | 1 | < 0.1% |
| 12536.99 | 1 | < 0.1% |
| 11994.31 | 1 | < 0.1% |
| 10698.32 | 1 | < 0.1% |
| 10264.8 | 1 | < 0.1% |
| 9736.03 | 1 | < 0.1% |
| 9063.58 | 1 | < 0.1% |
| 8117.62 | 1 | < 0.1% |
| 7786.44 | 1 | < 0.1% |
| 7398.26 | 1 | < 0.1% |
IDHM
Real number (ℝ≥0)
| Distinct | 349 |
|---|---|
| Distinct (%) | 6.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6591572711 |
| Minimum | 0.418 |
|---|---|
| Maximum | 0.862 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0.418 |
|---|---|
| 5-th percentile | 0.544 |
| Q1 | 0.599 |
| median | 0.665 |
| Q3 | 0.718 |
| 95-th percentile | 0.766 |
| Maximum | 0.862 |
| Range | 0.444 |
| Interquartile range (IQR) | 0.119 |
Descriptive statistics
| Standard deviation | 0.07196495405 |
|---|---|
| Coefficient of variation (CV) | 0.1091772134 |
| Kurtosis | -0.8425533263 |
| Mean | 0.6591572711 |
| Median Absolute Deviation (MAD) | 0.058 |
| Skewness | -0.1556693989 |
| Sum | 3671.506 |
| Variance | 0.005178954611 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.71 | 43 | 0.8% |
| 0.592 | 41 | 0.7% |
| 0.701 | 38 | 0.7% |
| 0.725 | 37 | 0.7% |
| 0.718 | 36 | 0.6% |
| 0.706 | 36 | 0.6% |
| 0.699 | 35 | 0.6% |
| 0.704 | 35 | 0.6% |
| 0.721 | 35 | 0.6% |
| 0.697 | 33 | 0.6% |
| Other values (339) | 5201 | 93.4% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.418 | 1 | < 0.1% |
| 0.443 | 1 | < 0.1% |
| 0.45 | 1 | < 0.1% |
| 0.452 | 1 | < 0.1% |
| 0.453 | 2 | < 0.1% |
| 0.469 | 1 | < 0.1% |
| 0.471 | 1 | < 0.1% |
| 0.473 | 1 | < 0.1% |
| 0.477 | 1 | < 0.1% |
| 0.479 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.862 | 1 | < 0.1% |
| 0.854 | 1 | < 0.1% |
| 0.847 | 1 | < 0.1% |
| 0.845 | 2 | < 0.1% |
| 0.84 | 1 | < 0.1% |
| 0.837 | 1 | < 0.1% |
| 0.827 | 1 | < 0.1% |
| 0.824 | 1 | < 0.1% |
| 0.823 | 1 | < 0.1% |
| 0.822 | 1 | < 0.1% |
PIBCAP
Real number (ℝ≥0)
| Distinct | 5567 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23513.94173 |
| Minimum | 4788.18 |
|---|---|
| Maximum | 583171.85 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 4788.18 |
|---|---|
| 5-th percentile | 6963.446 |
| Q1 | 9880.37 |
| median | 17433.84 |
| Q3 | 28729.9075 |
| 95-th percentile | 57861.862 |
| Maximum | 583171.85 |
| Range | 578383.67 |
| Interquartile range (IQR) | 18849.5375 |
Descriptive statistics
| Standard deviation | 24238.46308 |
|---|---|
| Coefficient of variation (CV) | 1.030812416 |
| Kurtosis | 95.38033139 |
| Mean | 23513.94173 |
| Median Absolute Deviation (MAD) | 8398.52 |
| Skewness | 6.893354925 |
| Sum | 130972655.4 |
| Variance | 587503092.5 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 9973.26 | 2 | < 0.1% |
| 9572.42 | 2 | < 0.1% |
| 22953.42 | 2 | < 0.1% |
| 26505.89 | 1 | < 0.1% |
| 26471.59 | 1 | < 0.1% |
| 11672.75 | 1 | < 0.1% |
| 8202.59 | 1 | < 0.1% |
| 5636.83 | 1 | < 0.1% |
| 39844.35 | 1 | < 0.1% |
| 31865.01 | 1 | < 0.1% |
| Other values (5557) | 5557 | 99.8% |
| Value | Count | Frequency (%) |
|---|---|---|
| 4788.18 | 1 | < 0.1% |
| 4901.07 | 1 | < 0.1% |
| 4903.02 | 1 | < 0.1% |
| 4970.45 | 1 | < 0.1% |
| 5062.94 | 1 | < 0.1% |
| 5064.6 | 1 | < 0.1% |
| 5079.92 | 1 | < 0.1% |
| 5200.11 | 1 | < 0.1% |
| 5263.41 | 1 | < 0.1% |
| 5309.7 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 583171.85 | 1 | < 0.1% |
| 419457.22 | 1 | < 0.1% |
| 362080.4 | 1 | < 0.1% |
| 337288.81 | 1 | < 0.1% |
| 306163.17 | 1 | < 0.1% |
| 304208.49 | 1 | < 0.1% |
| 291967.12 | 1 | < 0.1% |
| 268459.18 | 1 | < 0.1% |
| 229610.7 | 1 | < 0.1% |
| 225290.31 | 1 | < 0.1% |
NRO_EMP
Real number (ℝ≥0)
| Distinct | 59 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.69780754 |
| Minimum | 0 |
|---|---|
| Maximum | 532 |
| Zeros | 3558 |
| Zeros (%) | 63.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 7 |
| Maximum | 532 |
| Range | 532 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 9.483517312 |
|---|---|
| Coefficient of variation (CV) | 5.585743429 |
| Kurtosis | 1798.315999 |
| Mean | 1.69780754 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 34.98315526 |
| Sum | 9456.788 |
| Variance | 89.9371006 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 3558 | 63.9% |
| 1 | 944 | 16.9% |
| 2 | 331 | 5.9% |
| 3 | 187 | 3.4% |
| 4 | 117 | 2.1% |
| 5 | 86 | 1.5% |
| 6 | 53 | 1.0% |
| 7 | 36 | 0.6% |
| 8 | 27 | 0.5% |
| 9 | 25 | 0.4% |
| Other values (49) | 206 | 3.7% |
| Value | Count | Frequency (%) |
| 0 | 3558 | 63.9% |
| 1 | 944 | 16.9% |
| 1.697 | 4 | 0.1% |
| 2 | 331 | 5.9% |
| 3 | 187 | 3.4% |
| 4 | 117 | 2.1% |
| 5 | 86 | 1.5% |
| 6 | 53 | 1.0% |
| 7 | 36 | 0.6% |
| 8 | 27 | 0.5% |
| Value | Count | Frequency (%) |
|---|---|---|
| 532 | 1 | < 0.1% |
| 175 | 1 | < 0.1% |
| 129 | 1 | < 0.1% |
| 121 | 1 | < 0.1% |
| 114 | 1 | < 0.1% |
| 103 | 1 | < 0.1% |
| 99 | 1 | < 0.1% |
| 90 | 1 | < 0.1% |
| 71 | 1 | < 0.1% |
| 68 | 1 | < 0.1% |
MASSA_PCAP
Real number (ℝ≥0)
| Distinct | 293 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.016433752 |
| Minimum | 0.03 |
|---|---|
| Maximum | 5.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0.03 |
|---|---|
| 5-th percentile | 0.32 |
| Q1 | 0.66 |
| median | 0.99 |
| Q3 | 1.15 |
| 95-th percentile | 2.11 |
| Maximum | 5.9 |
| Range | 5.87 |
| Interquartile range (IQR) | 0.49 |
Descriptive statistics
| Standard deviation | 0.5526934418 |
|---|---|
| Coefficient of variation (CV) | 0.5437574663 |
| Kurtosis | 4.356521646 |
| Mean | 1.016433752 |
| Median Absolute Deviation (MAD) | 0.28 |
| Skewness | 1.585750541 |
| Sum | 5661.536 |
| Variance | 0.3054700406 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1.016 | 981 | 17.6% |
| 0.67 | 60 | 1.1% |
| 0.69 | 58 | 1.0% |
| 0.73 | 56 | 1.0% |
| 0.7 | 55 | 1.0% |
| 0.64 | 54 | 1.0% |
| 0.66 | 54 | 1.0% |
| 0.61 | 53 | 1.0% |
| 0.78 | 53 | 1.0% |
| 0.63 | 53 | 1.0% |
| Other values (283) | 4093 | 73.5% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.03 | 1 | < 0.1% |
| 0.1 | 19 | 0.3% |
| 0.11 | 11 | 0.2% |
| 0.12 | 11 | 0.2% |
| 0.13 | 7 | 0.1% |
| 0.14 | 9 | 0.2% |
| 0.15 | 4 | 0.1% |
| 0.16 | 8 | 0.1% |
| 0.17 | 2 | < 0.1% |
| 0.18 | 4 | 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 5.9 | 1 | < 0.1% |
| 5.47 | 1 | < 0.1% |
| 4.43 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 3.88 | 2 | < 0.1% |
| 3.74 | 1 | < 0.1% |
| 3.02 | 1 | < 0.1% |
| 3 | 17 | 0.3% |
| 2.99 | 9 | 0.2% |
| 2.98 | 2 | < 0.1% |
CUSTO_UNIM
Real number (ℝ≥0)
| Distinct | 3071 |
|---|---|
| Distinct (%) | 55.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 223.671177 |
| Minimum | 10 |
|---|---|
| Maximum | 3912.29 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 160.4225 |
| median | 223.671 |
| Q3 | 223.671 |
| 95-th percentile | 443.5365 |
| Maximum | 3912.29 |
| Range | 3902.29 |
| Interquartile range (IQR) | 63.2485 |
Descriptive statistics
| Standard deviation | 132.7104988 |
|---|---|
| Coefficient of variation (CV) | 0.5933285665 |
| Kurtosis | 123.5232457 |
| Mean | 223.671177 |
| Median Absolute Deviation (MAD) | 38.175 |
| Skewness | 6.010466398 |
| Sum | 1245848.456 |
| Variance | 17612.0765 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 223.671 | 2266 | 40.7% |
| 500 | 12 | 0.2% |
| 440 | 12 | 0.2% |
| 50 | 10 | 0.2% |
| 100 | 6 | 0.1% |
| 200 | 6 | 0.1% |
| 66.67 | 4 | 0.1% |
| 312.5 | 4 | 0.1% |
| 333.33 | 4 | 0.1% |
| 250 | 4 | 0.1% |
| Other values (3061) | 3242 | 58.2% |
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 1 | < 0.1% |
| 10.56 | 1 | < 0.1% |
| 10.63 | 1 | < 0.1% |
| 11.6 | 1 | < 0.1% |
| 11.76 | 1 | < 0.1% |
| 11.79 | 1 | < 0.1% |
| 12.5 | 2 | < 0.1% |
| 12.72 | 1 | < 0.1% |
| 14.06 | 1 | < 0.1% |
| 14.17 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 3912.29 | 1 | < 0.1% |
| 2025.46 | 1 | < 0.1% |
| 1700 | 1 | < 0.1% |
| 1648.62 | 1 | < 0.1% |
| 1389.88 | 1 | < 0.1% |
| 1243.89 | 1 | < 0.1% |
| 1227.69 | 1 | < 0.1% |
| 1210.31 | 1 | < 0.1% |
| 1170.34 | 1 | < 0.1% |
| 1139.93 | 1 | < 0.1% |
MASSA_PCAP_POP
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 297 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8823504488 |
| Minimum | 0.03 |
|---|---|
| Maximum | 6.62 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0.03 |
|---|---|
| 5-th percentile | 0.26 |
| Q1 | 0.56 |
| median | 0.86 |
| Q3 | 0.99 |
| 95-th percentile | 1.91 |
| Maximum | 6.62 |
| Range | 6.59 |
| Interquartile range (IQR) | 0.43 |
Descriptive statistics
| Standard deviation | 0.5190683693 |
|---|---|
| Coefficient of variation (CV) | 0.5882791469 |
| Kurtosis | 10.98648258 |
| Mean | 0.8823504488 |
| Median Absolute Deviation (MAD) | 0.24 |
| Skewness | 2.307497967 |
| Sum | 4914.692 |
| Variance | 0.269431972 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.882 | 981 | 17.6% |
| 0.65 | 65 | 1.2% |
| 0.68 | 61 | 1.1% |
| 0.56 | 59 | 1.1% |
| 0.64 | 58 | 1.0% |
| 0.6 | 57 | 1.0% |
| 0.47 | 57 | 1.0% |
| 0.48 | 57 | 1.0% |
| 0.66 | 56 | 1.0% |
| 0.69 | 55 | 1.0% |
| Other values (287) | 4064 | 73.0% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.03 | 1 | < 0.1% |
| 0.05 | 1 | < 0.1% |
| 0.06 | 2 | < 0.1% |
| 0.07 | 8 | 0.1% |
| 0.08 | 4 | 0.1% |
| 0.09 | 11 | 0.2% |
| 0.1 | 19 | 0.3% |
| 0.11 | 10 | 0.2% |
| 0.12 | 7 | 0.1% |
| 0.13 | 11 | 0.2% |
| Value | Count | Frequency (%) |
|---|---|---|
| 6.62 | 1 | < 0.1% |
| 5.9 | 1 | < 0.1% |
| 5.7 | 1 | < 0.1% |
| 5.47 | 1 | < 0.1% |
| 4.52 | 1 | < 0.1% |
| 4.43 | 1 | < 0.1% |
| 4.21 | 1 | < 0.1% |
| 4.07 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 3.99 | 1 | < 0.1% |
DESP_TOT_RSU
Real number (ℝ≥0)
| Distinct | 4333 |
|---|---|
| Distinct (%) | 77.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5208595.218 |
| Minimum | 12500 |
|---|---|
| Maximum | 2348522611 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 12500 |
|---|---|
| 5-th percentile | 125553.309 |
| Q1 | 390115.2425 |
| median | 1105360 |
| Q3 | 5208595.22 |
| 95-th percentile | 10403362.43 |
| Maximum | 2348522611 |
| Range | 2348510111 |
| Interquartile range (IQR) | 4818479.978 |
Descriptive statistics
| Standard deviation | 46800726.11 |
|---|---|
| Coefficient of variation (CV) | 8.985287617 |
| Kurtosis | 1963.818127 |
| Mean | 5208595.218 |
| Median Absolute Deviation (MAD) | 892147.13 |
| Skewness | 41.42497498 |
| Sum | 2.901187536 × 10¹⁰ |
| Variance | 2.190307965 × 10¹⁵ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5208595.22 | 981 | 17.6% |
| 300000 | 16 | 0.3% |
| 500000 | 10 | 0.2% |
| 100000 | 9 | 0.2% |
| 250000 | 9 | 0.2% |
| 120000 | 9 | 0.2% |
| 200000 | 9 | 0.2% |
| 1000000 | 8 | 0.1% |
| 600000 | 7 | 0.1% |
| 350000 | 7 | 0.1% |
| Other values (4323) | 4505 | 80.9% |
| Value | Count | Frequency (%) |
|---|---|---|
| 12500 | 1 | < 0.1% |
| 13600 | 1 | < 0.1% |
| 14400 | 1 | < 0.1% |
| 15000 | 1 | < 0.1% |
| 15672 | 1 | < 0.1% |
| 18066 | 1 | < 0.1% |
| 18822.83 | 1 | < 0.1% |
| 19911.38 | 1 | < 0.1% |
| 22000 | 1 | < 0.1% |
| 22800 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 2348522611 | 1 | < 0.1% |
| 2174193880 | 1 | < 0.1% |
| 482562163.3 | 1 | < 0.1% |
| 446998774.6 | 1 | < 0.1% |
| 408882336 | 1 | < 0.1% |
| 389926684.6 | 1 | < 0.1% |
| 380258578.7 | 1 | < 0.1% |
| 335564172 | 1 | < 0.1% |
| 309733213 | 1 | < 0.1% |
| 299889089.6 | 1 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
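The definition of ρ can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks` and `spearman_rho` are illustrative, and giving tied values the mean of their positions is an assumption about tie handling (it is the common convention). The example data are made up to show a nonlinear but monotonic relationship.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # nonlinear, but strictly increasing in x
print(spearman_rho(x, y))        # approximately 1
```

The 1/n factors in the covariance and the standard deviations cancel, so they are omitted from the sums.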
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
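A minimal sketch of that calculation follows, with a check of the location and scale invariance. The function name `pearson_r` and the example values are illustrative and not taken from the report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling y (with a positive factor) leaves r unchanged.
r_shifted = pearson_r(x, [3.0 * v + 5.0 for v in y])
print(r, r_shifted)
```

A negative scale factor would flip the sign of r but not its magnitude.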
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
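The pair-counting definition can be written directly as a double loop. Note that the formula as stated is the variant usually called τ-a, which makes no adjustment for ties; statistical libraries commonly report τ-b instead, so values may differ when the data contain ties. The function name `kendall_tau_a` and the example data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether the pair is ordered the same way.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))   # 2 concordant, 1 discordant, 3 pairs
```

This quadratic loop is fine for illustration; for a column of 5570 observations it visits roughly 15.5 million pairs, so an O(n log n) implementation would be preferable in practice.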
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
First rows
| | MUNICIPIO | POP_EST | DENS_DEM | IDHM | PIBCAP | NRO_EMP | MASSA_PCAP | CUSTO_UNIM | MASSA_PCAP_POP | DESP_TOT_RSU |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Abadia de Goiás | 9158 | 46.850 | 0.708 | 26505.890 | 3.000 | 1.450 | 170.270 | 0.882 | 1188166.840 |
| 1 | Abadia dos Dourados | 7022 | 7.610 | 0.689 | 18353.480 | 1.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 2 | Abadiânia | 20873 | 15.080 | 0.689 | 16132.950 | 3.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 3 | Abaeté | 23263 | 12.490 | 0.698 | 21286.430 | 1.000 | 1.120 | 223.671 | 0.990 | 279000.000 |
| 4 | Abaetetuba | 160439 | 87.610 | 0.628 | 9046.130 | 1.000 | 0.910 | 136.400 | 0.920 | 5613966.000 |
| 5 | Abaiara | 11965 | 58.690 | 0.628 | 7360.500 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 6 | Abaíra | 8681 | 15.680 | 0.603 | 6794.210 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 7 | Abaré | 20594 | 11.490 | 0.575 | 6957.040 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 8 | Abatiá | 7360 | 33.950 | 0.687 | 21529.760 | 0.000 | 1.520 | 223.950 | 1.150 | 897893.520 |
| 9 | Abdon Batista | 2534 | 11.250 | 0.694 | 24517.730 | 0.000 | 0.920 | 223.671 | 0.500 | 131142.920 |
Last rows
| | MUNICIPIO | POP_EST | DENS_DEM | IDHM | PIBCAP | NRO_EMP | MASSA_PCAP | CUSTO_UNIM | MASSA_PCAP_POP | DESP_TOT_RSU |
|---|---|---|---|---|---|---|---|---|---|---|
| 5560 | Xapuri | 19866 | 3.010 | 0.599 | 12553.210 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 5561 | Xavantina | 3873 | 19.120 | 0.749 | 54550.940 | 0.000 | 1.750 | 458.330 | 1.750 | 421521.640 |
| 5562 | Xaxim | 29254 | 87.670 | 0.752 | 33345.930 | 3.000 | 0.930 | 279.160 | 0.930 | 2613963.770 |
| 5563 | Xexéu | 14789 | 127.180 | 0.552 | 8422.310 | 1.000 | 0.750 | 223.671 | 0.490 | 775099.500 |
| 5564 | Xinguara | 45416 | 10.740 | 0.646 | 27618.270 | 1.000 | 0.780 | 223.671 | 0.750 | 7322501.490 |
| 5565 | Xique-Xique | 46562 | 8.280 | 0.585 | 8442.060 | 0.000 | 1.016 | 223.671 | 0.882 | 5208595.220 |
| 5566 | Zabelê | 2269 | 18.970 | 0.623 | 10724.450 | 0.000 | 1.330 | 223.671 | 1.330 | 756423.910 |
| 5567 | Zacarias | 2784 | 7.320 | 0.729 | 32979.590 | 0.000 | 0.820 | 223.671 | 0.820 | 107400.000 |
| 5568 | Zé Doca | 52190 | 20.770 | 0.595 | 8429.400 | 1.000 | 1.340 | 58.090 | 1.320 | 2131452.000 |
| 5569 | Zortéa | 3432 | 15.770 | 0.761 | 21389.690 | 0.000 | 0.990 | 447.100 | 0.990 | 454051.120 |